A Survey of Multilingual Text Retrieval

Authors

  • Douglas W. Oard
  • Bonnie J. Dorr
Abstract

This report reviews the present state of the art in selection of texts in one language based on queries in another, a problem we refer to as "multilingual text retrieval." Present applications of multilingual text retrieval systems are limited by the cost and complexity of developing and using the multilingual thesauri on which they are based and by the level of user training that is required to achieve satisfactory search effectiveness. A general model for multilingual text retrieval is used to review the development of the field and to describe modern production and experimental systems. The report concludes with some observations on the present state of the art and an extensive bibliography of the technical literature on multilingual text retrieval. The research reported herein was supported in part by Army Research Office contract DAAL C through Battelle Corporation, NSF NYI IRI, Alfred P. Sloan Research Fellow Award BR, a General Research Board Semester Award, and the Logos Corporation.

Similar resources

A multilingual text mining approach to web cross-lingual text retrieval

To enable concept-based cross-lingual text retrieval (CLTR) using multilingual text mining, our approach will first discover the multilingual concept–term relationships from linguistically diverse textual data relevant to a domain. Second, the multilingual concept–term relationships, in turn, are used to discover the conceptual content of the multilingual text, which is either a document contai...

A method for multilingual text mining and retrieval using growing hierarchical self-organizing maps

With the increasing amount of multilingual texts in the Internet, multilingual text retrieval techniques have become an important research issue. However, the discovery of relationships between different languages remains an open problem. In this paper we propose a method, which applied the growing hierarchical self-organizing map (GHSOM) model, to discover knowledge from multilingual text docu...

Discovering Parallel Text from the World Wide Web

Parallel corpus is a rich linguistic resource for various multilingual text management tasks, including crosslingual text retrieval, multilingual computational linguistics and multilingual text mining. Constructing a parallel corpus requires effective alignment of parallel documents. In this paper, we develop a parallel page identification system for identifying and aligning parallel documents ...

Mining bilingual topic hierarchies from unaligned text

Recent years have seen an exponential growth in the amount of multilingual text available on the web. This situation raises the need for novel applications for organizing and accessing multilingual content. Common examples of such applications include Multilingual Topic Tracking, Cross-Language Information retrieval systems etc. Most of these applications rely on the availability of multilingua...

Publication date: 1996